RNASeq


        If you use plots from MultiQC in a publication or presentation, please cite:

        MultiQC: Summarize analysis results for multiple tools and samples in a single report
        Philip Ewels, Måns Magnusson, Sverker Lundin and Max Käller
        Bioinformatics (2016)
        doi: 10.1093/bioinformatics/btw354
        PMID: 27312411
        About MultiQC

        This report was generated using MultiQC, version 1.31

        You can see a YouTube video describing how to use MultiQC reports here: https://youtu.be/qPbIlO_KWN0

        For more information about MultiQC, including other videos and extensive documentation, please visit http://multiqc.info

        You can report bugs, suggest improvements and find the source code for MultiQC on GitHub: https://github.com/MultiQC/MultiQC

        RNASeq
        National Facility for Data Handling and Analysis - Human Technopole

        This report includes the results of the RNASeq analysis pipeline developed by the National Facility for Data Handling and Analysis based on the nf-core/rnaseq pipeline, with some modifications to suit our needs. This pipeline performs quality control, alignment, quantification, and differential expression analysis of bulk transcriptomics data.

        This report has been generated by the nfdata-omics/rnaseq analysis pipeline.
        Report generated on 2026-01-27, 13:53 CET based on data in: /scratch/camilla.callierotti/nextflow/96/4eaa7aa2175d2c0935931c96dedf4c

        Sample-to-sample Correlation

        Sample-to-sample Pearson correlation is calculated from the CPM (counts per million) values of all expressed genes. The heatmap shows the pairwise correlation coefficients, with red indicating higher correlation and blue indicating lower correlation.
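
As an illustration of the calculation described above, the following minimal sketch computes CPM values from a toy count matrix, keeps the expressed genes, and builds the sample-to-sample Pearson correlation matrix. It is not the pipeline's code: the function names, the toy counts, and the expression rule (CPM of at least 1 in at least 25% of samples) are assumptions made for this example only.

```python
from math import sqrt

def cpm(counts):
    """Scale a genes x samples count matrix to counts per million per sample."""
    n_samples = len(counts[0])
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    return [[row[j] * 1e6 / lib_sizes[j] for j in range(n_samples)] for row in counts]

def keep_expressed(cpm_rows, min_cpm=1.0, min_frac=0.25):
    """Keep genes with CPM >= min_cpm in at least min_frac of samples (assumed rule)."""
    n_samples = len(cpm_rows[0])
    return [row for row in cpm_rows
            if sum(v >= min_cpm for v in row) >= min_frac * n_samples]

def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def sample_correlation(cpm_rows):
    """Pairwise Pearson correlation between the sample columns."""
    columns = [list(col) for col in zip(*cpm_rows)]
    return [[pearson(a, b) for b in columns] for a in columns]

# Toy counts: rows are genes, columns are three hypothetical samples.
counts = [
    [10, 20, 5],
    [20, 40, 50],
    [70, 140, 45],
    [0, 0, 0],      # never expressed, removed by the filter
]
expressed = keep_expressed(cpm(counts))
corr = sample_correlation(expressed)
```

In this toy input the second sample has exactly twice the counts of the first, so their CPM profiles are identical and their correlation is 1; library-size scaling removes the depth difference before samples are compared.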


        Dimensionality Reduction

        Principal Component Analysis (PCA) is a dimensionality reduction algorithm used to summarize the variability structure of an entire dataset in a few latent variables, called principal components. Each component is a linear combination of the original features (for example, genes) and is orthogonal to, and therefore uncorrelated with, all previous components. The components are the eigenvectors of the covariance matrix of the data, ordered so that the first component explains the largest percentage of variance and subsequent components account for progressively smaller proportions.
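
To make the eigenvector definition concrete, here is a minimal sketch that extracts the first two principal components of a small samples x features matrix by power iteration with deflation. This is an illustrative stand-in, not the method used by the pipeline (which runs in R); the helper names and the toy data are assumptions.

```python
from math import sqrt

def center_columns(rows):
    """Subtract each feature's mean (rows are samples, columns are features)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Sample covariance matrix of already centered data."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def top_eigenpair(M, iters=1000):
    """Leading eigenvalue and unit eigenvector of a symmetric matrix."""
    v = [float(i + 1) for i in range(len(M))]
    for _ in range(iters):
        w = matvec(M, v)
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(a * b for a, b in zip(v, matvec(M, v)))
    return lam, v

def pca(rows, k=2):
    """Return k (explained variance fraction, loading vector) pairs."""
    C = covariance(center_columns(rows))
    p = len(C)
    total = sum(C[i][i] for i in range(p))
    components = []
    for _ in range(k):
        lam, v = top_eigenpair(C)
        components.append((lam / total, v))
        # Deflation: remove the component just found before searching again.
        C = [[C[i][j] - lam * v[i] * v[j] for j in range(p)] for i in range(p)]
    return components

# Four hypothetical samples measured on three features.
samples = [
    [2.0, 0.0, 1.0],
    [0.0, 1.0, 3.0],
    [4.0, 2.0, 0.0],
    [6.0, 1.0, 2.0],
]
(frac1, pc1), (frac2, pc2) = pca(samples, k=2)
```

The two loading vectors come out orthogonal and the explained variance fractions decrease, which matches the ordering described above. For real expression data a library routine based on singular value decomposition would be the usual choice.
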
        Multidimensional Scaling (MDS) is also a dimensionality reduction algorithm that seeks to preserve the pairwise distances between samples when projecting data into a lower-dimensional space. Unlike PCA, which identifies linear combinations of features to maximize variance, MDS is more flexible and works directly with a distance matrix, making it suitable for data where the relationships are non-linear or not well captured by variance. If the relationships between the variables in the dataset are primarily linear and the variance structure is well-defined, results from the two algorithms are likely to converge.
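
The distance-matrix view of MDS can be sketched as follows. This example implements classical (metric) MDS on Euclidean distances, which is only one of several MDS variants and is an assumption here, since the report does not state which variant or distance the pipeline uses. All names and the toy points are hypothetical.

```python
from math import sqrt

def squared_distances(points):
    """Matrix of squared Euclidean distances between all pairs of points."""
    return [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in points]
            for p in points]

def double_center(d2):
    """Turn squared distances into the Gram matrix B = -1/2 * J * D2 * J."""
    n = len(d2)
    row_means = [sum(row) / n for row in d2]   # equal to column means (symmetric)
    grand = sum(row_means) / n
    return [[-0.5 * (d2[i][j] - row_means[i] - row_means[j] + grand)
             for j in range(n)] for i in range(n)]

def top_eigenpair(M, iters=1000):
    """Leading eigenpair by power iteration.

    The start vector is deliberately not constant: a constant vector lies in
    the null space of every double-centered matrix.
    """
    v = [float(i + 1) for i in range(len(M))]
    for _ in range(iters):
        w = [sum(a * b for a, b in zip(row, v)) for row in M]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(a * b for a, b in zip(row, v)) for row in M]
    return sum(a * b for a, b in zip(v, mv)), v

def mds_first_axis(points):
    """Coordinates of each point on the first classical MDS axis."""
    B = double_center(squared_distances(points))
    lam, v = top_eigenpair(B)
    return [sqrt(lam) * x for x in v]

# Three hypothetical samples lying on a straight line, 5 units apart.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
axis1 = mds_first_axis(points)
```

For these collinear points the first MDS axis places the samples at -5, 0 and 5 (up to an overall sign), which are also their PCA scores. This is the convergence mentioned above: with Euclidean distances, classical MDS and PCA give the same configuration.
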
        In a PCA/MDS scatterplot, each dot represents a sample; dots that are closer together indicate more similar samples, while distant dots suggest highly divergent samples or potential outliers. The coordinates of each dot on the plot are calculated by the algorithms in an unsupervised manner. However, dots can be color-coded based on sample characteristics chosen by the users, enabling the identification of whether clusters of samples are associated with specific experimental or technical variables.
        Both analyses were performed on normalized expression values of the top 5000 most variable genes among the expressed ones. The data were preprocessed by centering and scaling to achieve a zero mean and unit variance. The following interactive plots display the distribution of samples in the reduced component spaces and can be used to highlight different sample characteristics as reported in the metadata.
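
The preprocessing described above (selecting the most variable genes, then centering and scaling) can be sketched as below. This is a hedged illustration rather than the pipeline's implementation: the function names and toy values are hypothetical, and it assumes that scaling is applied per gene across samples.

```python
from statistics import mean, stdev, variance

def top_variable_genes(expr, n_top):
    """Names of the n_top genes with the largest variance across samples."""
    ranked = sorted(expr, key=lambda gene: variance(expr[gene]), reverse=True)
    return ranked[:n_top]

def zscore(values):
    """Center to zero mean and scale to unit (sample) variance."""
    m, s = mean(values), stdev(values)
    if s == 0:
        return [0.0 for _ in values]   # constant gene: nothing to scale
    return [(v - m) / s for v in values]

# Hypothetical normalized expression: gene -> values across four samples.
expr = {
    "geneA": [1.0, 1.0, 1.0, 1.0],
    "geneB": [1.0, 2.0, 3.0, 4.0],
    "geneC": [0.0, 10.0, 0.0, 10.0],
}
selected = top_variable_genes(expr, n_top=2)   # the report uses the top 5000
scaled = {gene: zscore(expr[gene]) for gene in selected}
```

Scaling each gene to unit variance prevents a few highly expressed genes from dominating the components, at the cost of giving low-variance genes the same weight as high-variance ones.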

        PCA Scatter Plot with Metadata

        Interactive PCA scatter plot with metadata filtering options.


        MDS Scatter Plot with Metadata

        Interactive MDS scatter plot with metadata filtering options.


        Enrichment analysis

        Over-representation analysis (ORA) and gene set enrichment analysis (GSEA).

        Over-representation and GSEA Results

        Interactive table showing the enrichment analysis results for each dataset and comparison. Click on any row to expand it and view the corresponding dot plots for the over-representation and GSEA results, with toggles for up-regulated, down-regulated, or all regulated genes. Dot size represents effect size, and color represents significance.
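
For readers unfamiliar with over-representation analysis, the sketch below shows the statistic usually behind it: a one-sided hypergeometric test per gene set, followed by Benjamini-Hochberg adjustment across gene sets. It is an assumption that the pipeline follows exactly this scheme. The numbers are invented, and the 0.1 cutoff simply mirrors the fdr_pathways value listed in the workflow summary of this report.

```python
from math import comb

def ora_pvalue(k, n_universe, n_set, n_hits):
    """P(overlap >= k) when n_hits genes are drawn from n_universe genes,
    of which n_set belong to the gene set (hypergeometric upper tail)."""
    upper = min(n_set, n_hits)
    tail = sum(comb(n_set, i) * comb(n_universe - n_set, n_hits - i)
               for i in range(k, upper + 1))
    return tail / comb(n_universe, n_hits)

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Invented example: 10 genes in the universe, a gene set of 3 genes,
# 3 differentially expressed genes, all 3 of which fall in the set.
p_example = ora_pvalue(k=3, n_universe=10, n_set=3, n_hits=3)

# Invented raw p-values for four gene sets, filtered at FDR 0.1.
padj = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
significant = [q <= 0.1 for q in padj]
```

The choice of gene universe (here 10 genes) strongly affects the p-value, so in practice it should be the set of genes that were actually tested, not the whole genome.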

        Dataset:      GSE52778_raw_counts_GRCh38.p13_NCBI
        Comparisons:  Treatment:Albuterol:Untreated, Treatment:Albute...
        Data Summary: Over-representation: 3/3, GSEA: 0/3

        Comparison                                   Over-representation  GSEA
        Treatment:Albuterol:Untreated                ✓                    ✗
        Treatment:Albuterol_Dexamethasone:Untreated  ✓                    ✗
        Treatment:Dexamethasone:Untreated            ✓                    ✗

        [Plots for each comparison: Over-representation Analysis, GSEA Analysis]


        Software Versions

        Software Versions lists versions of software tools extracted from file contents.

        Group                    Software              Version
        CLUSTERPROFILER_ORA      R                     4.4.2
                                 clusterProfiler       4.14.4
                                 enrichplot            1.26.6
                                 msigdbr               7.5.1
                                 openxlsx              4.2.8
                                 optparse              1.7.5
        CONTROL_GENE_HEATMAP     Biobase               2.66.0
                                 BiocGenerics          0.52.0
                                 GenomeInfoDb          1.42.0
                                 GenomicRanges         1.58.0
                                 IRanges               2.40.0
                                 MatrixGenerics        1.18.0
                                 R                     4.4.2
                                 S4Vectors             0.44.0
                                 SummarizedExperiment  1.36.0
                                 ggplot2               3.5.1
                                 matrixStats           1.4.1
                                 optparse              1.7.5
                                 pheatmap              1.0.12
                                 stringr               1.5.1
        CUSTOM_GETCHROMSIZES     getchromsizes         1.21
        DESEQ2_COMPARE           Biobase               2.66.0
                                 BiocGenerics          0.52.0
                                 DESeq2                1.46.0
                                 GenomeInfoDb          1.42.0
                                 GenomicRanges         1.58.0
                                 IRanges               2.40.0
                                 MatrixGenerics        1.18.0
                                 R                     4.4.2
                                 S4Vectors             0.44.0
                                 SummarizedExperiment  1.36.0
                                 ggplot2               3.5.1
                                 matrixStats           1.4.1
                                 optparse              1.7.5
                                 stringr               1.5.1
        DESEQ2_FIT               Biobase               2.66.0
                                 BiocGenerics          0.52.0
                                 DESeq2                1.46.0
                                 GenomeInfoDb          1.42.0
                                 GenomicRanges         1.58.0
                                 IRanges               2.40.0
                                 MatrixGenerics        1.18.0
                                 R                     4.4.2
                                 S4Vectors             0.44.0
                                 SummarizedExperiment  1.36.0
                                 edgeR                 4.4.0
                                 limma                 3.62.1
                                 matrixStats           1.4.1
                                 optparse              1.7.5
                                 stringr               1.5.1
        GENEID_TO_GENENAME       mawk                  null
                                 sed                   null
        GSEA                     R                     4.4.2
                                 clusterProfiler       4.14.4
                                 enrichplot            1.26.6
                                 openxlsx              4.2.8
                                 optparse              1.7.5
        GSEA_MERGE               R                     4.4.2
                                 openxlsx              4.2.7.1
                                 optparse              1.7.5
                                 stringr               1.5.1
        GTF2BED                  perl                  5.26.2
        GTF_FILTER               python                3.9.5
        GUNZIP_FASTA             gunzip                1.1
        GUNZIP_GTF               gunzip                1.1
        GUNZIP_TRANSCRIPT_FASTA  gunzip                1.1
        OBJ_CONSTRUCTION         Biobase               2.66.0
                                 BiocGenerics          0.52.0
                                 GenomeInfoDb          1.42.0
                                 GenomicRanges         1.58.0
                                 IRanges               2.40.0
                                 MatrixGenerics        1.18.0
                                 R                     4.4.2
                                 S4Vectors             0.44.0
                                 SummarizedExperiment  1.36.0
                                 matrixStats           1.4.1
                                 optparse              1.7.5
        PCA_AND_MDS              Biobase               2.66.0
                                 BiocGenerics          0.52.0
                                 GenomeInfoDb          1.42.0
                                 GenomicRanges         1.58.0
                                 IRanges               2.40.0
                                 MatrixGenerics        1.18.0
                                 R                     4.4.2
                                 S4Vectors             0.44.0
                                 SummarizedExperiment  1.36.0
                                 edgeR                 4.4.0
                                 ggplot2               3.5.1
                                 limma                 3.62.1
                                 matrixStats           1.4.1
                                 optparse              1.7.5
        R_COUNT_NORM             Biobase               2.66.0
                                 BiocGenerics          0.52.0
                                 GenomeInfoDb          1.42.0
                                 GenomicRanges         1.58.0
                                 IRanges               2.40.0
                                 MatrixGenerics        1.18.0
                                 R                     4.4.2
                                 S4Vectors             0.44.0
                                 SummarizedExperiment  1.36.0
                                 edgeR                 4.4.0
                                 limma                 3.62.1
                                 matrixStats           1.4.1
                                 optparse              1.7.5
        SALMON_INDEX             salmon                1.10.3
        SAMPLES_CORRELATION      Biobase               2.66.0
                                 BiocGenerics          0.52.0
                                 GenomeInfoDb          1.42.0
                                 GenomicRanges         1.58.0
                                 IRanges               2.40.0
                                 MatrixGenerics        1.18.0
                                 R                     4.4.2
                                 S4Vectors             0.44.0
                                 SummarizedExperiment  1.36.0
                                 ggplot2               3.5.1
                                 matrixStats           1.4.1
                                 optparse              1.7.5
                                 pheatmap              1.0.12
                                 stringr               1.5.1
        STAR_GENOMEGENERATE      gawk                  5.1.0
                                 samtools              1.2
                                 star                  2.7.11b
        UCSC_GTFTOGENEPRED       ucsc                  447
        Workflow                 Nextflow              24.04.4
                                 nfdata-omics/rnaseq   v0.0.0dev-gdd86bb6

        nfdata-omics/rnaseq Methods Description

        Suggested text and references to use when describing pipeline usage within the methods section of a publication.
        URL: https://github.com/nfdata-omics/rnaseq

        Methods

        Data was processed using nfdata-omics/rnaseq v0.0.0dev of the nf-core collection of workflows (Ewels et al., 2020), utilising reproducible software environments from the Bioconda (Grüning et al., 2018) and Biocontainers (da Veiga Leprevost et al., 2017) projects.

        The pipeline was executed with Nextflow v24.04.4 (Di Tommaso et al., 2017) with the following command:

        nextflow run nfdata-omics/rnaseq -r custom_multiqc_test -hub ht_gitlab -params-file nf-params.yml -profile ht_cluster -ansi-log false -resume

        References

        • Di Tommaso, P., Chatzou, M., Floden, E. W., Barja, P. P., Palumbo, E., & Notredame, C. (2017). Nextflow enables reproducible computational workflows. Nature Biotechnology, 35(4), 316–319. doi: 10.1038/nbt.3820
        • Ewels, P. A., Peltzer, A., Fillinger, S., Patel, H., Alneberg, J., Wilm, A., Garcia, M. U., Di Tommaso, P., & Nahnsen, S. (2020). The nf-core framework for community-curated bioinformatics pipelines. Nature Biotechnology, 38(3), 276–278. doi: 10.1038/s41587-020-0439-x
        • Grüning, B., Dale, R., Sjödin, A., Chapman, B. A., Rowe, J., Tomkins-Tinch, C. H., Valieris, R., Köster, J., & Bioconda Team. (2018). Bioconda: sustainable and comprehensive software distribution for the life sciences. Nature Methods, 15(7), 475–476. doi: 10.1038/s41592-018-0046-7
        • da Veiga Leprevost, F., Grüning, B. A., Alves Aflitos, S., Röst, H. L., Uszkoreit, J., Barsnes, H., Vaudel, M., Moreno, P., Gatto, L., Weber, J., Bai, M., Jimenez, R. C., Sachsenberg, T., Pfeuffer, J., Vera Alvarez, R., Griss, J., Nesvizhskii, A. I., & Perez-Riverol, Y. (2017). BioContainers: an open-source and community-driven framework for software standardization. Bioinformatics (Oxford, England), 33(16), 2580–2582. doi: 10.1093/bioinformatics/btx192
        Notes:
        • If available, update the text to include the Zenodo DOI of the version of the pipeline used.
        • The command above does not include parameters contained in any configs or profiles that may have been used. Ensure the config file is also uploaded with your publication!
        • You should also cite all software used within this run. Check the "Software Versions" section of this report for version information.

        nfdata-omics/rnaseq Workflow Summary

        Input/output options

        input: /facility/nfdata-omics/projects/IU2_024_SORT1-KO-Thyroid_Coscia/results/20251114/sample_sheet.csv
        metadata: /home/camilla.callierotti/rnaseq_agent_demo/metadata.csv
        outdir: output

        Count matrix options

        counts: /home/camilla.callierotti/rnaseq_agent_demo/GSE52778_raw_counts_GRCh38.p13_NCBI.tsv

        Reference genome options

        fasta: /facility/nfdata-omics/reference/human/gencode/v44/GRCh38.p14.genome.fa.gz
        gtf: /facility/nfdata-omics/reference/human/gencode/v44/gencode.v44.annotation.gtf.gz
        transcript_fasta: /facility/nfdata-omics/reference/human/gencode/v44/gencode.v44.transcripts.fa.gz

        Dimensionality reduction and DEA

        comparisons: Treatment:Dexamethasone:Untreated,Treatment:Albuterol:Untreated,Treatment:Albuterol_Dexamethasone:Untreated,Treatment:Albuterol_Dexamethasone:Dexamethasone
        control_genes_list: /home/camilla.callierotti/rnaseq_agent_demo/ctrl_genes.txt
        fdr_pathways: 0.1
        frac_expressed: 0.25
        genesets: /facility/nfdata-omics/reference/human/MSigDB/v2024.1.Hs/c2.all.v2024.1.Hs.entrez.gmt,/facility/nfdata-omics/reference/human/MSigDB/v2024.1.Hs/c5.all.v2024.1.Hs.entrez.gmt,/facility/nfdata-omics/reference/human/MSigDB/v2024.1.Hs/h.all.v2024.1.Hs.entrez.gmt
        lfc_threshold: 0.0
        model_formula: ~0+Treatment
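
To show how the comparisons and model_formula options above could fit together, the sketch below parses the comparison string and derives one contrast vector per comparison. It assumes each comparison is written as factor:test_level:reference_level, which is an interpretation and not something the report states. With an intercept-free formula such as ~0+Treatment the design matrix has one coefficient per treatment level, so a comparison becomes a difference between two coefficients. The function names are hypothetical.

```python
def parse_comparisons(spec):
    """Split 'factor:test:reference,...' into (factor, test, reference) tuples."""
    comparisons = []
    for item in spec.split(","):
        factor, test_level, ref_level = item.strip().split(":")
        comparisons.append((factor, test_level, ref_level))
    return comparisons

def contrast_vector(levels, test_level, ref_level):
    """+1 for the test level, -1 for the reference level, 0 elsewhere."""
    return [1 if lv == test_level else -1 if lv == ref_level else 0
            for lv in levels]

# Comparison string copied from the workflow summary above.
spec = ("Treatment:Dexamethasone:Untreated,Treatment:Albuterol:Untreated,"
        "Treatment:Albuterol_Dexamethasone:Untreated,"
        "Treatment:Albuterol_Dexamethasone:Dexamethasone")

comparisons = parse_comparisons(spec)

# One design column per level, in sorted order (assumed column order).
levels = sorted({lv for _, test, ref in comparisons for lv in (test, ref)})

contrasts = {f"{test}_vs_{ref}": contrast_vector(levels, test, ref)
             for _, test, ref in comparisons}
```

Here `levels` is `['Albuterol', 'Albuterol_Dexamethasone', 'Dexamethasone', 'Untreated']`, and the contrast for Dexamethasone against Untreated is `[0, 0, 1, -1]`. The actual column order used by the pipeline may differ, so a contrast should always be matched to coefficient names, not positions.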

        Core Nextflow options

        configFiles: N/A
        containerEngine: singularity
        launchDir: /home/camilla.callierotti/rnaseq_agent_demo
        profile: ht_cluster
        projectDir: /home/camilla.callierotti/.nextflow/assets/nfdata-omics/rnaseq
        revision: custom_multiqc_test
        runName: hopeful_koch
        userName: camilla.callierotti
        workDir: /scratch/camilla.callierotti/nextflow